Chapter 3
Protease Cleavage Pattern Discovery
A protein functions when it interacts with molecules or
chemicals. Among the many forms of interaction, protease
cleavage has been a widely researched subject for several
decades. This type of research aims to build a predictive
model from collected laboratory data in order to discover
novel interactions. Such a model is commonly built on
laboratory-verified protease cleavage data, in which the
association between protease cleavage structure and
protease cleavage function can be examined. A protease
cleavage structure commonly means a primary sequence (or
sub-sequence or peptide) that is believed to contain a
specific amino acid composition pattern or trend related to
the protease cleavage function. In other words, a protease
cleavage pattern must not show a random amino acid
composition. Instead, the amino acid composition of a data
set of protease-cleaved peptides should show a trend that a
specific protease recognises in order to interact. For a
protease cleavage pattern
discovery model to work efficiently, two types of peptides
are collected and pooled together to construct the model:
cleaved peptides and non-cleaved peptides. Non-cleaved
peptides must show no trend in amino acid composition at
all; instead, their amino acids must be randomly
distributed. By contrasting a data set with a trend against
a data set without any trend, a pattern by which the two
types of data can